-
-
- Characters and character sets for various languages
-
- Thu Jun 17 12:29:46 MET DST 1993
-
-
- Harald Tveit Alvestrand
- SINTEF DELAB
- Harald.Alvestrand@delab.sintef.no
-
-
-
- Abstract
-
- There is a need to have a source of information about the
- characters that are used in various languages. No such information
- is currently readily available on the net. This document attempts
- to fill that void.
-
-
- Status of this Memo
-
- This draft document is being circulated for comment.
- It does not yet cover anything but Latin-based scripts; volunteers
- to collect material for other scripts are sought.
-
- Please send comments to the author, or to the RARE WG-CHAR list
- <wg-char@rare.nl>.
-
- The following text is required by the Internet-draft rules:
-
- This document is an Internet Draft. Internet Drafts are working
- documents of the Internet Engineering Task Force (IETF), its
- Areas, and its Working Groups. Note that other groups may also
- distribute working documents as Internet Drafts.
-
- Internet Drafts are draft documents valid for a maximum of six
- months. Internet Drafts may be updated, replaced, or obsoleted by
- other documents at any time. It is not appropriate to use
- Internet Drafts as reference material or to cite them other than
- as a "working draft" or "work in progress."
-
- Please check the I-D abstract listing contained in each Internet
- Draft directory to learn the current status of this or any other
- Internet Draft.
-
-
-
-
-
- Alvestrand Expires Dec 17 93 [Page 1]
-
- draft Languages and character sets Mar 93
-
-
- 1. Introduction
-
- There are a lot of languages in the world. Estimates vary between
- 500 and 6000, with some eternal conflicts about the difference
- between a language and a dialect guaranteeing that any list
- claiming to be authoritative will be the source of endless debate.
-
- Many of these languages have a writing system. Some have several.
- These are also likely to have changed over time, with the meaning
- of character symbols changing, the shape of the characters
- changing, or completely new characters being added, or old ones
- removed from the set. This means that even within a single
- language, a list of characters is likely to be controversial.
-
- These problems have made several experts in the field of languages
- and characters refuse to even consider the idea of working out
- such a list.
-
- Nevertheless, it is clear that an easily available source of this
- kind of information is needed, in order to:
-
-
- (1) Identify the problems encountered when trying to use
- equipment with limited character support for a language
-
- (2) Identify what support for additional characters will be
- "enough" for that language
-
- (3) Identify what internationally standardized character sets are
- able to fulfill the requirements for that language
-
-
- The tables given below are an attempt at providing such an
- identification.
-
- The rest of the document is in 3 parts: The language tables a
-
-
-
-
- 2. Introduction to language tables
-
-
- 2.1. Table structure
-
- Each language is listed in up to 5 parts:
-
-
- (1) The language name with its ISO 639 code if applicable
-
- (2) The characters required for that language. For brevity, the
- characters of ASCII (A-Z) are not listed. Note that some
- languages do NOT require all the ASCII characters.
-
- (3) Characters that are in normal use, but have replacements that
- mostly do not change the meaning of the word in context.
- These may be called "optional" characters. This should _not_
- be taken as liberty to remove those characters from the
- language, but as a reminder that if it is great trouble to
- use the charsets that cover the complete language, a smaller
- character set may be used without causing grievous harm to
- the expressive power of the writer.
-
- (4) Internationally registered character sets that cover the
- required and/or optional characters for that language.
-
- (5) Comments
-
- The division between "required" and "optional" characters is
- likely to produce much discussion. As a rough guide, I have
- taken the registered ISO 646 variants of a number of
- countries, and classified as "optional" all characters which
- did _not_ appear in that ISO 646 variant. As a result, an ISO
- 646 variant should appear under the "required characters
- only" heading for all languages that have an ISO 646 variant.
-
- Note that for brevity, only the lower case version of the
- character is listed. If no note is made, one should assume
- that the upper case version is equally required.
-
- Note, however, that a lot of languages permit the dropping of
- accents on upper case characters where it would be considered
- improper to drop them on lower case characters.
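The table structure described above can be pictured as a small record type. The following is only a sketch: the class name `LanguageEntry` and its field names are hypothetical and do not come from this memo, and the charset lists in the sample are abbreviated. The sample values are copied from the German table in section 3.19.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LanguageEntry:
    """One language table, following the parts listed in section 2.1."""

    name: str
    iso639: Optional[str] = None  # None where the tables show "??"
    required: Dict[str, int] = field(default_factory=dict)  # mnemonic -> code point
    optional: Dict[str, int] = field(default_factory=dict)
    charsets_whole: List[str] = field(default_factory=list)
    charsets_required_only: List[str] = field(default_factory=list)
    comments: str = ""

    def characters(self, include_optional: bool = True) -> str:
        """Return the listed lower-case characters, sorted by code point."""
        points = set(self.required.values())
        if include_optional:
            points |= set(self.optional.values())
        return "".join(chr(p) for p in sorted(points))


# Values copied from the German table; charset lists are abbreviated here.
german = LanguageEntry(
    name="German",
    iso639="de",
    required={"a:": 0x00E4, "o:": 0x00F6, "u:": 0x00FC, "ss": 0x00DF},
    optional={"e'": 0x00E9, "a!": 0x00E0},
    charsets_whole=["ISO_8859-1:1987"],
    charsets_required_only=["DIN_66003"],
    comments='The "ss" character exists only in lower case.',
)
```

Keeping required and optional characters in separate mappings mirrors the distinction the tables draw, so a caller can ask for either set without re-reading the table.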
-
-
- 2.2. Sources utilized
-
- The table of Latin-script languages is based on work by Johan van
- Wingen <BUTPAA@rulmvs.leidenuniv.nl>. The others are best
- guesses by the author.
-
- The tables of character sets prepared by Keld Jorn Simonsen
- <keld@dkuug.dk> (RFC-KELD) were invaluable in matching the data on
- languages to the data on character sets.
-
- The language codes (for those languages that have codes) come from
- ISO 639.
-
- NOTE: ISO 639 is a very incomplete list of the world's languages
- (perhaps 10 to 20% according to some experts), and is undergoing
- revision. It is used here only because it is the only
- ISO-standardized shorthand notation for languages available at the
- moment.
-
- Languages for which no such exact information is known are listed
- at the end of the tables.
-
-
- 2.3. What accents mean
-
- For those who feel unfamiliar with the names of accents:
-
-
- Grave
- slants upwards to the left, like the Unix "backtick".
-
-
- Acute
- slants upwards to the right.
-
-
- Circumflex
- looks like a little pointed hat.
-
-
- Tilde
- looks like a wavy line.
-
-
- Macron
- looks like a bar placed on top of the character.
-
-
- Breve
- looks like the lower quarter of a circle, placed on top of
- the character.
-
-
- Dot above
- should be self-explanatory.
-
-
- Diaeresis
- looks like 2 dots above the character.
-
-
- Ring above
- should be self-explanatory.
-
-
- Cedilla
- looks like a little squiggle on the bottom of the letter,
- down and then left.
-
-
- Ogonek
- looks like a squiggle too, but goes down and to the right.
-
-
- Caron
- looks like a little "v" on top of the character.
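For readers who prefer to inspect a character rather than judge its shape, the accent names above can be recovered programmatically. The sketch below assumes Python's `unicodedata` module and the present-day Unicode combining marks. Neither is part of this memo, and the `ACCENTS` table and `accents_of` helper are hypothetical names introduced for illustration.

```python
import unicodedata

# Accent names from section 2.3, keyed by the Unicode combining mark
# (an assumed correspondence; the memo itself only describes the shapes).
ACCENTS = {
    "\u0300": "Grave",
    "\u0301": "Acute",
    "\u0302": "Circumflex",
    "\u0303": "Tilde",
    "\u0304": "Macron",
    "\u0306": "Breve",
    "\u0307": "Dot above",
    "\u0308": "Diaeresis",
    "\u030a": "Ring above",
    "\u0327": "Cedilla",
    "\u0328": "Ogonek",
    "\u030c": "Caron",
}


def accents_of(ch):
    """Return the names of the accents carried by a single character."""
    decomposed = unicodedata.normalize("NFD", ch)
    return [ACCENTS[c] for c in decomposed if c in ACCENTS]
```

For example, `accents_of("\u0105")` reports the ogonek on the first Lithuanian character, and an unaccented letter yields an empty list. Letters that are not composed from a base letter and a mark, such as the stroke letters, also yield an empty list.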
-
-
- 3. Language tables
-
-
- 3.1. lt Lithuanian
-
- Required characters
-
- a; 0105 LATIN SMALL LETTER A WITH OGONEK
- e; 0119 LATIN SMALL LETTER E WITH OGONEK
- i; 012f LATIN SMALL LETTER I WITH OGONEK
- u; 0173 LATIN SMALL LETTER U WITH OGONEK
- e. 0117 LATIN SMALL LETTER E WITH DOT ABOVE
- u- 016b LATIN SMALL LETTER U WITH MACRON
- c< 010d LATIN SMALL LETTER C WITH CARON
- s< 0161 LATIN SMALL LETTER S WITH CARON
- z< 017e LATIN SMALL LETTER Z WITH CARON
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- ISO_8859-4:1988 (iso 110)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
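Each character row in these tables has three fields: a two-character mnemonic, a hexadecimal code point, and a character name. A minimal sketch of reading such rows and cross-checking them follows. It assumes Python's `unicodedata` module as the reference for names, which is not something this memo relies on, and `parse_row` and `name_matches` are hypothetical helper names. The rows are copied from the Lithuanian table above.

```python
import unicodedata

LITHUANIAN_ROWS = [
    "a; 0105 LATIN SMALL LETTER A WITH OGONEK",
    "e; 0119 LATIN SMALL LETTER E WITH OGONEK",
    "i; 012f LATIN SMALL LETTER I WITH OGONEK",
    "u; 0173 LATIN SMALL LETTER U WITH OGONEK",
    "e. 0117 LATIN SMALL LETTER E WITH DOT ABOVE",
    "u- 016b LATIN SMALL LETTER U WITH MACRON",
    "c< 010d LATIN SMALL LETTER C WITH CARON",
    "s< 0161 LATIN SMALL LETTER S WITH CARON",
    "z< 017e LATIN SMALL LETTER Z WITH CARON",
]


def parse_row(row):
    """Split a table row into (mnemonic, code point, character name)."""
    mnemonic, hex_code, name = row.split(None, 2)
    return mnemonic, int(hex_code, 16), name


def name_matches(row):
    """True if the row's name equals the name recorded for its code point."""
    _, code_point, name = parse_row(row)
    return unicodedata.name(chr(code_point), "") == name


mismatches = [row for row in LITHUANIAN_ROWS if not name_matches(row)]
```

A check like this is only a consistency aid. Rows whose names carry a parenthetical note, such as "(Icelandic)", or that use older naming would be reported as mismatches even though the code point is right.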
-
-
- 3.2. lv Latvian
-
- Required characters
-
- a- 0101 LATIN SMALL LETTER A WITH MACRON
- e- 0113 LATIN SMALL LETTER E WITH MACRON
- i- 012b LATIN SMALL LETTER I WITH MACRON
- o- 014d LATIN SMALL LETTER O WITH MACRON
- u- 016b LATIN SMALL LETTER U WITH MACRON
- g, 0123 LATIN SMALL LETTER G WITH CEDILLA
- k, 0137 LATIN SMALL LETTER K WITH CEDILLA
- l, 013c LATIN SMALL LETTER L WITH CEDILLA
- n, 0146 LATIN SMALL LETTER N WITH CEDILLA
- r, 0157 LATIN SMALL LETTER R WITH CEDILLA
- c< 010d LATIN SMALL LETTER C WITH CARON
- s< 0161 LATIN SMALL LETTER S WITH CARON
- z< 017e LATIN SMALL LETTER Z WITH CARON
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- ISO_8859-4:1988 (iso 110)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- latin6 (iso 157)
-
-
- 3.3. et Estonian
-
- Required characters
-
- o? 00f5 LATIN SMALL LETTER O WITH TILDE
- a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- s< 0161 LATIN SMALL LETTER S WITH CARON
- z< 017e LATIN SMALL LETTER Z WITH CARON
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- ISO_8859-4:1988 (iso 110)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
-
-
- 3.4. fi Finnish
-
- Required characters
-
- a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
-
- Character sets covering the whole
-
- NATS-SEFI (iso 8)
- NATS-DANO-ADD (iso 9)
- SEN_850200_B (iso 10)
- SEN_850200_C (iso 11)
- DIN_66003 (iso 21)
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- ISO_8859-4:1988 (iso 110)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
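A coverage claim of this kind can be spot-checked for those character sets that have a codec in a given programming environment. The sketch below assumes Python's built-in codecs and an assumed mapping from the registered names used in these tables to Python codec names. Sets without a Python codec, such as T.61-8bit or videotex-suppl, are left out. `PYTHON_CODECS` and `covers` are hypothetical names.

```python
# Assumed mapping from names used in the tables to Python codec names.
PYTHON_CODECS = {
    "ISO_8859-1:1987": "iso8859_1",
    "ISO_8859-2:1987": "iso8859_2",
    "ISO_8859-3:1988": "iso8859_3",
    "ISO_8859-4:1988": "iso8859_4",
    "ISO_8859-9:1989": "iso8859_9",
    "latin6": "iso8859_10",
}


def covers(charset, chars):
    """True if every character, in lower and upper case, can be encoded."""
    codec = PYTHON_CODECS[charset]
    for ch in chars:
        for form in (ch, ch.upper()):
            try:
                form.encode(codec)
            except UnicodeEncodeError:
                return False
    return True


FINNISH_REQUIRED = "\u00e4\u00f6"  # a: and o: from the Finnish table
finnish_sets = [name for name in PYTHON_CODECS if covers(name, FINNISH_REQUIRED)]
```

The upper-case form is tested as well because the tables list only lower case and ask the reader to assume the upper case is equally required.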
-
-
- 3.5. ?? Sami
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
- e: 00eb LATIN SMALL LETTER E WITH DIAERESIS
- i: 00ef LATIN SMALL LETTER I WITH DIAERESIS
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- ae 00e6 LATIN SMALL LETTER AE
- aa 00e5 LATIN SMALL LETTER A WITH RING ABOVE
- o/ 00f8 LATIN SMALL LETTER O WITH STROKE
- d/ 0111 LATIN SMALL LETTER D WITH STROKE
- n' 0144 LATIN SMALL LETTER N WITH ACUTE
- ng 014b LATIN SMALL LETTER ENG
- t/ 0167 LATIN SMALL LETTER T WITH STROKE
- c< 010d LATIN SMALL LETTER C WITH CARON
- s< 0161 LATIN SMALL LETTER S WITH CARON
- z< 017e LATIN SMALL LETTER Z WITH CARON
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
-
-
- 3.6. sv Swedish
-
- Required characters
-
- a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- aa 00e5 LATIN SMALL LETTER A WITH RING ABOVE
-
- Optional characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- e: 00eb LATIN SMALL LETTER E WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- ISO_8859-4:1988 (iso 110)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
-
- Character sets covering the required characters only
-
- NATS-SEFI (iso 8)
- SEN_850200_B (iso 10)
- SEN_850200_C (iso 11)
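When only a "required characters only" set is available, the optional characters have to be replaced. This memo does not say which replacement to use. The sketch below assumes, purely for illustration, that an optional accented letter falls back to its unaccented base letter. `fold_optional` is a hypothetical helper, and the optional characters are copied from the Swedish table above.

```python
import unicodedata

SWEDISH_OPTIONAL = "\u00e1\u00e9\u00eb\u00fc"  # a', e', e:, u: from section 3.6


def fold_optional(text, optional):
    """Replace optional accented letters by their base letter (assumed fallback).

    Required characters, and anything else not listed in `optional`,
    pass through unchanged.
    """
    out = []
    for ch in text:
        if ch.lower() in optional:
            decomposed = unicodedata.normalize("NFD", ch)
            # Drop the combining marks, keep the base letter.
            out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
        else:
            out.append(ch)
    return "".join(out)
```

With this fallback, "id\u00e9" would become "ide" while the required "\u00f6" in "Malm\u00f6" is left alone. Whether that replacement is acceptable remains a judgement for speakers of the language.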
-
-
- 3.7. no Norwegian
-
- Required characters
-
- ae 00e6 LATIN SMALL LETTER AE
- aa 00e5 LATIN SMALL LETTER A WITH RING ABOVE
- o/ 00f8 LATIN SMALL LETTER O WITH STROKE
-
- Optional characters
-
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
-
- Character sets covering the required characters only
-
- NATS-DANO (iso 9)
- NS_4551-1 (iso 60)
- NS_4551-2 (iso 61)
- ISO_8859-4:1988 (iso 110)
-
-
- 3.8. da Danish
-
- Required characters
-
- ae 00e6 LATIN SMALL LETTER AE
- aa 00e5 LATIN SMALL LETTER A WITH RING ABOVE
- o/ 00f8 LATIN SMALL LETTER O WITH STROKE
-
- Optional characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- y' 00fd LATIN SMALL LETTER Y WITH ACUTE
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
-
- Character sets covering the required characters only
-
- NATS-DANO (iso 9)
- NS_4551-1 (iso 60)
- NS_4551-2 (iso 61)
- ISO_8859-4:1988 (iso 110)
- ISO_8859-9:1989 (iso 148)
-
-
- 3.9. fo Faeroese
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- y' 00fd LATIN SMALL LETTER Y WITH ACUTE
- ae 00e6 LATIN SMALL LETTER AE
- o/ 00f8 LATIN SMALL LETTER O WITH STROKE
- d- 00f0 LATIN SMALL LETTER ETH (Icelandic)
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
-
-
- 3.10. is Icelandic
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- y' 00fd LATIN SMALL LETTER Y WITH ACUTE
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- ae 00e6 LATIN SMALL LETTER AE
- d- 00f0 LATIN SMALL LETTER ETH (Icelandic)
- th 00fe LATIN SMALL LETTER THORN (Icelandic)
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
-
-
- 3.11. kl Greenlandic
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- e> 00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
- i> 00ee LATIN SMALL LETTER I WITH CIRCUMFLEX
- o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
- u> 00fb LATIN SMALL LETTER U WITH CIRCUMFLEX
- ae 00e6 LATIN SMALL LETTER AE
- aa 00e5 LATIN SMALL LETTER A WITH RING ABOVE
- o/ 00f8 LATIN SMALL LETTER O WITH STROKE
- a? 00e3 LATIN SMALL LETTER A WITH TILDE
- i? 0129 LATIN SMALL LETTER I WITH TILDE
- u? 0169 LATIN SMALL LETTER U WITH TILDE
- kk 0138 LATIN SMALL LETTER KRA (Greenlandic)
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
-
-
- 3.12. ?? Gaelic
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- a! 00e0 LATIN SMALL LETTER A WITH GRAVE
- e! 00e8 LATIN SMALL LETTER E WITH GRAVE
- i! 00ec LATIN SMALL LETTER I WITH GRAVE
- o! 00f2 LATIN SMALL LETTER O WITH GRAVE
- u! 00f9 LATIN SMALL LETTER U WITH GRAVE
-
- Character sets covering the whole
-
- GB_2312-80 (iso 58)
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
-
- 3.13. ga Irish
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
-
- Character sets covering the whole
-
- GB_2312-80 (iso 58)
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- CSA_Z243.4-1985-gr (iso 123)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
-
-
- 3.14. cy Welsh
-
- Required characters
-
- w' 1e83 LATIN SMALL LETTER W WITH ACUTE
- y' 00fd LATIN SMALL LETTER Y WITH ACUTE
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- a! 00e0 LATIN SMALL LETTER A WITH GRAVE
- e! 00e8 LATIN SMALL LETTER E WITH GRAVE
- i! 00ec LATIN SMALL LETTER I WITH GRAVE
- o! 00f2 LATIN SMALL LETTER O WITH GRAVE
- u! 00f9 LATIN SMALL LETTER U WITH GRAVE
- w! 1e81 LATIN SMALL LETTER W WITH GRAVE
- y! 1ef3 LATIN SMALL LETTER Y WITH GRAVE
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- e> 00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
- i> 00ee LATIN SMALL LETTER I WITH CIRCUMFLEX
- o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
- u> 00fb LATIN SMALL LETTER U WITH CIRCUMFLEX
- w> 0175 LATIN SMALL LETTER W WITH CIRCUMFLEX
- y> 0177 LATIN SMALL LETTER Y WITH CIRCUMFLEX
- a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
- e: 00eb LATIN SMALL LETTER E WITH DIAERESIS
- i: 00ef LATIN SMALL LETTER I WITH DIAERESIS
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- w: 1e85 LATIN SMALL LETTER W WITH DIAERESIS
- y: 00ff LATIN SMALL LETTER Y WITH DIAERESIS
- This language has no known character set
-
-
- 3.15. br Breton
-
- Required characters
-
- e> 00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
- u! 00f9 LATIN SMALL LETTER U WITH GRAVE
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- n? 00f1 LATIN SMALL LETTER N WITH TILDE
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- CSA_Z243.4-1985-gr (iso 123)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
-
- 3.16. fy Frisian
-
- Required characters
-
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- e> 00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
- o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
- u> 00fb LATIN SMALL LETTER U WITH CIRCUMFLEX
- a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
- e: 00eb LATIN SMALL LETTER E WITH DIAERESIS
- i: 00ef LATIN SMALL LETTER I WITH DIAERESIS
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
-
-
- 3.17. nl Dutch
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
- e: 00eb LATIN SMALL LETTER E WITH DIAERESIS
- i: 00ef LATIN SMALL LETTER I WITH DIAERESIS
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- ij 0133 LATIN SMALL LIGATURE IJ
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
-
- 3.18. af Afrikaans
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- e! 00e8 LATIN SMALL LETTER E WITH GRAVE
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- e> 00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
- i> 00ee LATIN SMALL LETTER I WITH CIRCUMFLEX
- o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
- u> 00fb LATIN SMALL LETTER U WITH CIRCUMFLEX
- e: 00eb LATIN SMALL LETTER E WITH DIAERESIS
- i: 00ef LATIN SMALL LETTER I WITH DIAERESIS
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
-
-
- 3.19. de German
-
- Required characters
-
- a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- ss 00df LATIN SMALL LETTER SHARP S (German)
-
- Optional characters
-
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- a! 00e0 LATIN SMALL LETTER A WITH GRAVE
-
- Comments
-
- The "ss" character exists only in lower case; the upper case
- equivalent is "SS" (2 letters).
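The consequence of this comment for software is that upper-casing German text can change its length. A small illustration follows, assuming Python's built-in string case mapping; the variable names are only for the example.

```python
word = "stra\u00dfe"

upper = word.upper()        # the sharp s expands to the two letters "SS"
round_trip = upper.lower()  # lower-casing does not restore the sharp s
```

Code that assumes a one-to-one mapping between lower and upper case, for instance by upper-casing into a fixed-size buffer, would need to allow for this.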
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
- JIS_X0212-1990 (iso 159)
-
- Character sets covering the required characters only
-
- DIN_66003 (iso 21)
- ISO_8859-2:1987 (iso 101)
- ISO_8859-4:1988 (iso 110)
- CSN_369103 (iso 139)
- latin6 (iso 157)
-
-
- 3.20. fr French
-
- Required characters
-
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- e! 00e8 LATIN SMALL LETTER E WITH GRAVE
- u! 00f9 LATIN SMALL LETTER U WITH GRAVE
- c, 00e7 LATIN SMALL LETTER C WITH CEDILLA
- a! 00e0 LATIN SMALL LETTER A WITH GRAVE
-
- Optional characters
-
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- e> 00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
- i> 00ee LATIN SMALL LETTER I WITH CIRCUMFLEX
- o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
- u> 00fb LATIN SMALL LETTER U WITH CIRCUMFLEX
- ae 00e6 LATIN SMALL LETTER AE
- oe 0153 LATIN SMALL LIGATURE OE
- e: 00eb LATIN SMALL LETTER E WITH DIAERESIS
- i: 00ef LATIN SMALL LETTER I WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- y: 00ff LATIN SMALL LETTER Y WITH DIAERESIS
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
-
- Character sets covering the required characters only
-
- IT (iso 15)
- NF_Z_62-010_(1973) (iso 25)
- NF_Z_62-010 (iso 69)
- ISO_8859-1:1987 (iso 100)
- ISO_8859-3:1988 (iso 109)
- CSA_Z243.4-1985-1 (iso 121)
- CSA_Z243.4-1985-2 (iso 122)
- CSA_Z243.4-1985-gr (iso 123)
- ISO_8859-9:1989 (iso 148)
- JIS_X0212-1990 (iso 159)
-
-
- 3.21. ca Catalan
-
- Required characters
-
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- a! 00e0 LATIN SMALL LETTER A WITH GRAVE
- e! 00e8 LATIN SMALL LETTER E WITH GRAVE
- o! 00f2 LATIN SMALL LETTER O WITH GRAVE
- i: 00ef LATIN SMALL LETTER I WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- l. 0140 LATIN SMALL LETTER L WITH MIDDLE DOT
- n? 00f1 LATIN SMALL LETTER N WITH TILDE
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
-
- 3.22. es Spanish
-
- Required characters
-
- n? 00f1 LATIN SMALL LETTER N WITH TILDE
- c, 00e7 LATIN SMALL LETTER C WITH CEDILLA
- !I 00a1 INVERTED EXCLAMATION MARK
- ?I 00bf INVERTED QUESTION MARK
-
- Optional characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
-
- Comments
-
- Note that this language also uses special punctuation marks. The
- c, appears in ISO 646-ES, but not in van Wingen's tables.
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- CSA_Z243.4-1985-gr (iso 123)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
- JIS_X0212-1990 (iso 159)
-
- Character sets covering the required characters only
-
- ES (iso 17)
- ES2 (iso 85)
-
-
- 3.23. gl Galician
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- n? 00f1 LATIN SMALL LETTER N WITH TILDE
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- CSA_Z243.4-1985-gr (iso 123)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
- JIS_X0212-1990 (iso 159)
-
-
- 3.24. pt Portuguese
-
- Required characters
-
- a? 00e3 LATIN SMALL LETTER A WITH TILDE
- o? 00f5 LATIN SMALL LETTER O WITH TILDE
- c, 00e7 LATIN SMALL LETTER C WITH CEDILLA
-
- Optional characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- a! 00e0 LATIN SMALL LETTER A WITH GRAVE
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- e> 00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
- o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
- Character sets covering the required characters only
-
- PT (iso 16)
- PT2 (iso 84)
- ISO_8859-9:1989 (iso 148)
-
-
- 3.25. eu Basque
-
- Required characters
-
- n? 00f1 LATIN SMALL LETTER N WITH TILDE
- c, 00e7 LATIN SMALL LETTER C WITH CEDILLA
-
- Character sets covering the whole
-
- ES (iso 17)
- videotex-suppl (iso 70)
- ES2 (iso 85)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- CSA_Z243.4-1985-gr (iso 123)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
- JIS_X0212-1990 (iso 159)
-
-
- 3.26. mt Maltese
-
- Required characters
-
- a! 00e0 LATIN SMALL LETTER A WITH GRAVE
- e! 00e8 LATIN SMALL LETTER E WITH GRAVE
- i! 00ec LATIN SMALL LETTER I WITH GRAVE
- o! 00f2 LATIN SMALL LETTER O WITH GRAVE
- u! 00f9 LATIN SMALL LETTER U WITH GRAVE
- i> 00ee LATIN SMALL LETTER I WITH CIRCUMFLEX
- c. 010b LATIN SMALL LETTER C WITH DOT ABOVE
- g. 0121 LATIN SMALL LETTER G WITH DOT ABOVE
- h/ 0127 LATIN SMALL LETTER H WITH STROKE
- z. 017c LATIN SMALL LETTER Z WITH DOT ABOVE
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
-
- 3.27. it Italian
-
- Required characters
-
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- a! 00e0 LATIN SMALL LETTER A WITH GRAVE
- e! 00e8 LATIN SMALL LETTER E WITH GRAVE
- i! 00ec LATIN SMALL LETTER I WITH GRAVE
- o! 00f2 LATIN SMALL LETTER O WITH GRAVE
-
- Optional characters
-
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- u! 00f9 LATIN SMALL LETTER U WITH GRAVE
- i: 00ef LATIN SMALL LETTER I WITH DIAERESIS
-
- Comments
-
- The accented characters appear only in the lower case variant in
- the Italian version of ISO 646 (ISO-IR-15).
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
- Character sets covering the required characters only
-
- GB_2312-80 (iso 58)
-
-
- 3.28. ?? Rhaetian
-
- Required characters
-
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- a! 00e0 LATIN SMALL LETTER A WITH GRAVE
- e! 00e8 LATIN SMALL LETTER E WITH GRAVE
- o! 00f2 LATIN SMALL LETTER O WITH GRAVE
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- e> 00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
- i> 00ee LATIN SMALL LETTER I WITH CIRCUMFLEX
- o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
-
- 3.29. ro Romanian
-
- Required characters
-
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- i> 00ee LATIN SMALL LETTER I WITH CIRCUMFLEX
- a( 0103 LATIN SMALL LETTER A WITH BREVE
- s, 015f LATIN SMALL LETTER S WITH CEDILLA
- t, 0163 LATIN SMALL LETTER T WITH CEDILLA
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
-
- 3.30. hu Hungarian
-
- Required characters
-
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- o" 0151 LATIN SMALL LETTER O WITH DOUBLE ACUTE
- u" 0171 LATIN SMALL LETTER U WITH DOUBLE ACUTE
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
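-
- As a rough cross-check of the list above, the sketch below tests whether
- the Hungarian required characters can be encoded in a given character
- set. It relies on Python's built-in codecs, which is an assumption of
- this example: the codec names (iso8859_1, iso8859_2) are Python's own,
- not the registry names used in the tables, and several of the listed
- sets (for example T.61-8bit) have no Python codec at all. The helper
- name covers() is hypothetical.
-
```python
def covers(codec, chars):
    """Return True if every character in chars can be encoded with codec."""
    try:
        chars.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Code points copied from the "Required characters" table for Hungarian:
# 00e1 00e9 00ed 00f3 00fa 00f6 00fc 0151 0171
HUNGARIAN_REQUIRED = "\u00e1\u00e9\u00ed\u00f3\u00fa\u00f6\u00fc\u0151\u0171"

for codec in ("iso8859_1", "iso8859_2"):
    print(codec, covers(codec, HUNGARIAN_REQUIRED))
```
-
- ISO 8859-2 should report full coverage, while ISO 8859-1 lacks the two
- double-acute letters, which agrees with its absence from the list above.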
-
-
- 3.31. sq Albanian
-
- Required characters
-
- e: 00eb LATIN SMALL LETTER E WITH DIAERESIS
- c, 00e7 LATIN SMALL LETTER C WITH CEDILLA
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-1:1987 (iso 100)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- CSA_Z243.4-1985-gr (iso 123)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
- JIS_X0212-1990 (iso 159)
-
-
- 3.32. tr Turkish
-
- Required characters
-
- a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
- i> 00ee LATIN SMALL LETTER I WITH CIRCUMFLEX
- u> 00fb LATIN SMALL LETTER U WITH CIRCUMFLEX
- o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
- u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
- i. 0131 LATIN SMALL LETTER I WITH NO DOT
- c, 00e7 LATIN SMALL LETTER C WITH CEDILLA
- s, 015f LATIN SMALL LETTER S WITH CEDILLA
- g( 011f LATIN SMALL LETTER G WITH BREVE
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- ISO_8859-9:1989 (iso 148)
-
-
- 3.33. hr Croatian
-
- Required characters
-
- c' 0107 LATIN SMALL LETTER C WITH ACUTE
- d/ 0111 LATIN SMALL LETTER D WITH STROKE
- c< 010d LATIN SMALL LETTER C WITH CARON
- s< 0161 LATIN SMALL LETTER S WITH CARON
- z< 017e LATIN SMALL LETTER Z WITH CARON
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- JUS_I.B1.002 (iso 141)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
-
- 3.34. sl Slovenian
-
- Required characters
-
- c< 010d LATIN SMALL LETTER C WITH CARON
- s< 0161 LATIN SMALL LETTER S WITH CARON
- z< 017e LATIN SMALL LETTER Z WITH CARON
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- ISO_8859-4:1988 (iso 110)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- JUS_I.B1.002 (iso 141)
- ISO_6937-2-add (iso 142)
- latin6 (iso 157)
- JIS_X0212-1990 (iso 159)
-
-
- 3.35. sk Slovak
-
- Required characters
-
- y' 00fd LATIN SMALL LETTER Y WITH ACUTE
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
- o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
- l' 013a LATIN SMALL LETTER L WITH ACUTE
- r' 0155 LATIN SMALL LETTER R WITH ACUTE
- c< 010d LATIN SMALL LETTER C WITH CARON
- d< 010f LATIN SMALL LETTER D WITH CARON
- l< 013e LATIN SMALL LETTER L WITH CARON
- n< 0148 LATIN SMALL LETTER N WITH CARON
- s< 0161 LATIN SMALL LETTER S WITH CARON
- t< 0165 LATIN SMALL LETTER T WITH CARON
- z< 017e LATIN SMALL LETTER Z WITH CARON
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- ISO_6937-2-add (iso 142)
-
-
- 3.36. cs Czech
-
- Required characters
-
- y' 00fd LATIN SMALL LETTER Y WITH ACUTE
- a' 00e1 LATIN SMALL LETTER A WITH ACUTE
- e' 00e9 LATIN SMALL LETTER E WITH ACUTE
- i' 00ed LATIN SMALL LETTER I WITH ACUTE
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- u' 00fa LATIN SMALL LETTER U WITH ACUTE
- e< 011b LATIN SMALL LETTER E WITH CARON
- u0 016f LATIN SMALL LETTER U WITH RING ABOVE
- c< 010d LATIN SMALL LETTER C WITH CARON
- d< 010f LATIN SMALL LETTER D WITH CARON
- n< 0148 LATIN SMALL LETTER N WITH CARON
- r< 0159 LATIN SMALL LETTER R WITH CARON
- s< 0161 LATIN SMALL LETTER S WITH CARON
- t< 0165 LATIN SMALL LETTER T WITH CARON
- z< 017e LATIN SMALL LETTER Z WITH CARON
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- ISO_6937-2-add (iso 142)
-
-
- 3.37. pl Polish
-
- Required characters
-
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- a; 0105 LATIN SMALL LETTER A WITH OGONEK
- e; 0119 LATIN SMALL LETTER E WITH OGONEK
- c' 0107 LATIN SMALL LETTER C WITH ACUTE
- n' 0144 LATIN SMALL LETTER N WITH ACUTE
- s' 015b LATIN SMALL LETTER S WITH ACUTE
- z' 017a LATIN SMALL LETTER Z WITH ACUTE
- l/ 0142 LATIN SMALL LETTER L WITH STROKE
- z. 017c LATIN SMALL LETTER Z WITH DOT ABOVE
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- ISO_6937-2-add (iso 142)
- JIS_X0212-1990 (iso 159)
-
-
- 3.38. ?? Sorbian
-
- Required characters
-
- o' 00f3 LATIN SMALL LETTER O WITH ACUTE
- e< 011b LATIN SMALL LETTER E WITH CARON
- c' 0107 LATIN SMALL LETTER C WITH ACUTE
- n' 0144 LATIN SMALL LETTER N WITH ACUTE
- s' 015b LATIN SMALL LETTER S WITH ACUTE
- z' 017a LATIN SMALL LETTER Z WITH ACUTE
- l/ 0142 LATIN SMALL LETTER L WITH STROKE
- c< 010d LATIN SMALL LETTER C WITH CARON
- r< 0159 LATIN SMALL LETTER R WITH CARON
- s< 0161 LATIN SMALL LETTER S WITH CARON
- z< 017e LATIN SMALL LETTER Z WITH CARON
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- ISO_8859-2:1987 (iso 101)
- T.61-8bit (iso 103)
- T.101-G2 (iso 128)
- CSN_369103 (iso 139)
- ISO_6937-2-add (iso 142)
-
-
- 3.39. eo Esperanto
-
- Required characters
-
- u( 016d LATIN SMALL LETTER U WITH BREVE
- c> 0109 LATIN SMALL LETTER C WITH CIRCUMFLEX
- g> 011d LATIN SMALL LETTER G WITH CIRCUMFLEX
- h> 0125 LATIN SMALL LETTER H WITH CIRCUMFLEX
- j> 0135 LATIN SMALL LETTER J WITH CIRCUMFLEX
- s> 015d LATIN SMALL LETTER S WITH CIRCUMFLEX
-
- Character sets covering the whole
-
- videotex-suppl (iso 70)
- iso-ir-90 (iso 90)
- ANSI_X3.110-1983 (iso 99)
- T.61-8bit (iso 103)
- ISO_8859-3:1988 (iso 109)
- T.101-G2 (iso 128)
- ISO_6937-2-add (iso 142)
- ISO_8859-supp (iso 154)
- JIS_X0212-1990 (iso 159)
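-
- The two-character mnemonics in the first column of the tables above
- pair a base letter with an accent sign. As an illustration only, the
- sketch below turns such a mnemonic into a single character by appending
- a Unicode combining mark and normalizing to NFC, then checks the result
- against code points copied from the tables. The accent-to-mark mapping
- is an assumption inferred from the character names in the tables; RFC
- 1345 remains the authoritative definition, and mnemonics such as "ae",
- "ss" or "d-" are not handled. The names COMBINING, SPECIAL and
- mnemonic_to_char() are hypothetical.
-
```python
import unicodedata

# Accent sign -> Unicode combining mark (assumed mapping, inferred from the
# character names in the tables; RFC 1345 is the authoritative source).
COMBINING = {
    "'": "\u0301",  # acute
    "!": "\u0300",  # grave
    ">": "\u0302",  # circumflex
    ":": "\u0308",  # diaeresis
    "?": "\u0303",  # tilde
    ",": "\u0327",  # cedilla
    "<": "\u030c",  # caron
    "(": "\u0306",  # breve
    ";": "\u0328",  # ogonek
    ".": "\u0307",  # dot above
    '"': "\u030b",  # double acute
    "0": "\u030a",  # ring above
}

# Mnemonics whose characters have no canonical decomposition; the code
# points are copied directly from the tables.
SPECIAL = {"i.": "\u0131", "d/": "\u0111", "l/": "\u0142"}


def mnemonic_to_char(mnemonic):
    """Convert a two-character mnemonic such as "o:" to one character."""
    if mnemonic in SPECIAL:
        return SPECIAL[mnemonic]
    base, accent = mnemonic
    return unicodedata.normalize("NFC", base + COMBINING[accent])


# (mnemonic, code point) pairs copied from the tables above.
TABLE_ROWS = [
    ('o"', 0x0151), ("u0", 0x016F), ("s,", 0x015F), ("a;", 0x0105),
    ("z.", 0x017C), ("c<", 0x010D), ("g(", 0x011F), ("j>", 0x0135),
    ("i.", 0x0131), ("d/", 0x0111),
]

for mnemonic, code_point in TABLE_ROWS:
    assert ord(mnemonic_to_char(mnemonic)) == code_point
```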
-
-
- 4. Other languages with appropriate character sets
-
- Other languages for which appropriate character sets are known are
- listed in the table below.
-
- Language Character set
-
- ar Arabic ISO-8859-6
- be Byelorussian ISO-8859-5
- bg Bulgarian ISO-8859-5
- el Greek ISO-8859-7
- en English USASCII
- fa Persian ISO-8859-6
- iw Hebrew ISO-8859-8
- ja Japanese ISO-IR-87 (Japanese JIS C6226-1983)
- ko Korean ISO-IR-149 (Korean KS C 5601-1989)
- la Latin USASCII
- lo Laotian ISO-IR-166
- ru Russian ISO-8859-5
- sw Swahili USASCII
- th Thai ISO-IR-166
- uk Ukrainian ISO-8859-5
- ur Urdu ISO-8859-6
- vo Volapuk ISO-8859-1
- zh Chinese ISO-IR-58 (Chinese GB 2312-80)
-
- Additional entries in this table are welcome!
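-
- A table like this lends itself to a simple lookup. The sketch below
- maps some of the language codes above to Python codec names and encodes
- a short string with the result. The codec names are Python's and are
- assumed here to be equivalent to the table entries; only the ISO 8859
- and USASCII rows are included, and the names CODEC_FOR_LANGUAGE and
- encode_for_language() are hypothetical.
-
```python
# ISO 639 code -> Python codec name (assumed equivalents of the table rows).
CODEC_FOR_LANGUAGE = {
    "ar": "iso8859_6",
    "be": "iso8859_5",
    "bg": "iso8859_5",
    "el": "iso8859_7",
    "en": "ascii",
    "iw": "iso8859_8",
    "la": "ascii",
    "ru": "iso8859_5",
    "sw": "ascii",
    "uk": "iso8859_5",
    "vo": "iso8859_1",
}


def encode_for_language(language, text):
    """Encode text in the character set listed for the given language."""
    return text.encode(CODEC_FOR_LANGUAGE[language])


russian = encode_for_language("ru", "\u043c\u0438\u0440")  # a Russian word
greek = encode_for_language("el", "\u03b1\u03b2\u03b3")    # alpha beta gamma
print(russian.hex(), greek.hex())
```
-
- Each character comes out as a single octet, since all the sets used in
- this sketch are single-byte sets.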
-
-
- 4.1. ISO 10646 only languages
-
- To the author's limited knowledge, the following languages can be
- written with the current ISO 10646 standard, but not with any other
- registered character set:
-
-
- Language             Country(ies)                     Script(s)
-
- aa Afar              Somalia, Ethiopia, Djibouti      Latin
- ab Abkhazian         Georgia                          Cyrillic
- am Amharic           Ethiopia                         Ethiopic
- as Assamese          India, Nepal                     Bengali
- ay Aymara            Bolivia, Peru, Chile             Latin
- az Azerbaijani       SNC, Iran, Iraq, Turkey          Cyrillic, Arabic
- ba Bashkir           SNC                              Cyrillic
- bh Bihari            India                            Gujarati (or Kaithi)
- bi Bislama           Vanuatu, New Caledonia           Latin
- bn Bengali           India                            Bengali
- co Corsican          France                           Latin
- fj Fiji              Fiji                             Latin
- gd Scots Gaelic      UK                               Latin
- gn Guarani           Paraguay                         Latin
- gu Gujarati          India                            Gujarati
- ha Hausa             Nigeria, Niger, Chad, Sudan,...  Latin
- hi Hindi             India                            Devanagari
- hy Armenian          Armenia                          Armenian
- ia Interlingua       None (Artificial Language)       Latin
- ie Interlingue       None (Artificial Language)       Latin
- ik Inupiak           USA, Canada                      Latin, Cree
- in Indonesian        Indonesia                        Latin
- ji Yiddish           Germany, USA, SNC, Israel        Hebrew
- jw Javanese          Indonesia, Malaysia              Latin, Javanese
- ka Georgian          Georgia                          Georgian
- kk Kazakh            SNC, Afghanistan                 Cyrillic, Arabic
- km Cambodian         Cambodia                         Khmer
- kn Kannada           India                            Kannada
- ks Kashmiri          India, Pakistan                  Arabic
- ku Kurdish           SNC, Turkey, Iraq, Iran          Cyrillic, Arabic
- ky Kirghiz           SNC, China, Afghanistan          Cyrillic, Arabic
- ln Lingala           CAR, Congo, Zaire                Latin
- mg Malagasy          Madagascar, Comoro Islands       Latin, Arabic
- mi Maori             New Zealand                      Latin
- mk Macedonian        Greece, Yugoslavia               Greek, Cyrillic
- ml Malayalam         India                            Malayalam
- mn Mongolian         Mongolia                         Cyrillic, Mongolian
- mo Moldavian         Romania                          Latin
- mr Marathi           India                            Devanagari
- ms Malay             Malaysia, Thailand               Latin
- my Burmese           Myanmar                          Burmese
- na Nauru             Nauru                            Latin
- ne Nepali            Nepal                            Devanagari
- oc Occitan           France                           Latin
- or Oriya             India                            Oriya
- pa Punjabi           India                            Gurmukhi
- ps Pashto (Western)  Afghanistan, Iran                Arabic
- qu Quechua           Peru                             Latin
- rm Rhaeto-Romance    Switzerland                      Latin
- rn Kirundi           Burundi, Uganda                  Latin
- rw Kinyarwanda       Rwanda, Uganda, Zaire            Latin
- sa Sanskrit          India                            Devanagari
- sd Sindhi            Pakistan, India, Afghanistan     Arabic, Gurmukhi
- sg Sangho            Central African Republic         Latin
- si Singhalese        Sri Lanka                        Sinhalese
- sm Samoan            Samoa, USA, New Zealand          Latin
- sn Shona             Zimbabwe, Zambia, Mozambique     Latin
- so Somali            Somalia, Ethiopia, Djibouti      Latin
- sr Serbian           former Yugoslavia                Cyrillic
- ss Siswati           S. Africa, Swaziland             Latin
- st Sesotho           S. Africa, Lesotho               Latin
- su Sundanese         Indonesia                        Latin
- ta Tamil             India, Malaysia                  Tamil
- te Telugu            India                            Telugu
- tg Tajik             Tajikistan                       Arabic
- ti Tigrinya          Ethiopia                         Latin, Ethiopic
- tk Turkmen           SNC, Iran, Afghanistan           Cyrillic, Arabic
- tl Tagalog           Philippines                      Latin
- tn Setswana          S. Africa, Botswana, Namibia     Latin
- to Tonga (3)         Mozambique                       Latin
- ts Tsonga            Mozambique, Swaziland            Latin
- tt Tatar             SNC                              Cyrillic
- tw Twi (Ewe)         Ghana                            Latin
- uz Uzbek (Southern)  Afghanistan, Turkey              Arabic
- vi Vietnamese        Vietnam, Cambodia, China         Latin
- wo Wolof             Senegal, Mauritania              Latin
- xh Xhosa             S. Africa                        Latin
- yo Yoruba            Nigeria, Togo, Benin             Latin
- zu Zulu              S. Africa, Lesotho, Malawi       Latin
-
-
- The information about languages in ISO 10646 was kindly supplied
- by Glenn Adams <glenn@metis.com>.
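-
- To illustrate what "ISO 10646 only" means in practice, the sketch below
- takes sample words in three of the scripts named in the table
- (Devanagari, Armenian, Georgian), reports the script from the Unicode
- character names, and shows that the words cannot be encoded in ISO
- 8859-1 but can be encoded in a 16-bit form. It assumes that Python's
- Unicode database and its "utf-16-be" codec are acceptable stand-ins for
- ISO 10646 and its two-octet form; taking the first word of a character
- name as the script is a heuristic, and the names scripts_of(),
- encodable() and SAMPLES are hypothetical.
-
```python
import unicodedata


def scripts_of(text):
    """Return the first word of each character's Unicode name, as a set."""
    return {unicodedata.name(ch).split()[0] for ch in text}


def encodable(text, codec):
    """Return True if text can be encoded with the named Python codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Sample words, keyed by the language codes used in the table.
SAMPLES = {
    "hi": "\u0939\u093f\u0928\u094d\u0926\u0940",        # Hindi
    "hy": "\u0570\u0561\u0575\u0565\u0580\u0565\u0576",  # Armenian
    "ka": "\u10e5\u10d0\u10e0\u10d7\u10e3\u10da\u10d8",  # Georgian
}

for code, word in SAMPLES.items():
    print(code, scripts_of(word),
          encodable(word, "iso8859_1"), encodable(word, "utf-16-be"))
```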
-
- Languages for which the author does NOT know any proper character
- set include:
-
-
- bo Tibetan
- dz Bhutani
- et Estonian
- lt Lithuanian
- lv Latvian, Lettish
- mt Maltese
- sh Serbo-Croatian
-
-
-
- 5. Encoded format of charset data
-
- This section contains, in a very compact format, all the
- information used to produce the technical content of this document,
- apart from the content of ISO 639 and RFC 1345.
-
- It would be helpful if new information were also supplied in this
- format.
-
-
- # A list of languages and their required/optional characters.
- # Format:
- # &language Name
- # Required characters
- # Important characters
- # Comments
- &language Lithuanian
- a; e; i; u; e. u- c< s< z<
-
- &language Latvian
- a- e- i- o- u- g, k, l, n, r, c< s< z<
-
- &language Estonian
- o? a: o: u: s< z<
-
- &language Finnish
- a: o:
-
- &language Sami
- a' e' a> a: e: i: o: u: ae aa o/ d/ n' ng t/ c< s< z<
-
- &language Swedish
- a: o: aa
- a' e' e: u:
-
- &language Norwegian
- ae aa o/
- e' o' o>
-
- &language Danish
- ae aa o/
- a' e' i' o' u' y'
-
- &language Faeroese
- a' i' o' u' y' ae o/ d-
-
- &language Icelandic
- a' e' i' o' u' y' o: ae d- th
-
- &language Greenlandic
- a' e' i' u' a> e> i> o> u> ae aa o/ a? i? u? kk
-
- &language Gaelic
- a' e' o' a! e! i! o! u!
-
- &language Irish
- a' e' i' o' u'
-
- &language Welsh
- w' y' a' e' i' o' u' a! e! i! o! u! w! y! a> e> i> o> u> w> y> a: e: i: o: u: w: y:
-
- &language Breton
- e> u! u: n?
-
- &language Frisian
- e' u' a> e> o> u> a: e: i: o: u:
-
- &language Dutch
- a' e' i' o' u' a: e: i: o: u: ij
-
- &language Afrikaans
- a' e' e! a> e> i> o> u> e: i: o: 'n
-
- &language German
- a: o: u: ss
- e' a!
- The "ss" character exists only in lower case; the upper case equivalent
- is "SS" (2 letters).
-
- &language French
- e' e! u! c, a!
- a> e> i> o> u> ae oe e: i: u: y:
-
- &language Catalan
- e' i' o' u' a! e! o! i: u: l. n?
-
- &language Spanish
- n? c, !I ?I
- a' e' i' o' u' u: n?
- Note that this language also uses special punctuation marks.
- The c, appears in ISO 646-ES, but not in van Wingen's tables.
-
-
- &language Galician
- a' e' i' o' u' u: n?
-
- &language Portuguese
- a? o? c,
- a' e' i' o' u' a! a> e> o> u:
-
- &language Basque
- n? c,
-
- &language Maltese
- a! e! i! o! u! i> c. g. h/ z.
-
- &language Italian
- e' o' a! e! i! o!
- i' u' u! i:
- The accented characters appear only in the lower case variant in
- the Italian version of ISO 646 (ISO-IR-15).
-
- &language Rhaetian
- e' a! e! o! a> e> i> o> o: u:
-
- &language Romanian
- a> i> a( s, t,
-
- &language Hungarian
- a' e' i' o' u' o: u: o" u"
-
- &language Albanian
- e: c,
-
- &language Turkish
- a> i> u> o: u: i. c, s, g(
-
- &language Croatian
- c' d/ c< s< z<
-
- &language Slovenian
- c< s< z<
-
- &language Slovak
- y' a' e' i' o' u' a: o> l' r' c< d< l< n< s< t< z<
-
- &language Czech
- y' a' e' i' o' u' e< u0 c< d< n< r< s< t< z<
-
- &language Polish
- o' a; e; c' n' s' z' l/ z.
-
- &language Sorbian
- o' e< c' n' s' z' l/ c< r< s< z<
-
- &language Esperanto
- u( c> g> h> j> s>
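-
- The format above is simple enough to read mechanically. The following
- is a minimal sketch of such a reader, not a definitive implementation.
- It assumes a purely positional layout within each "&language" record:
- the first non-blank line holds the required characters, the second the
- line the format comment calls "Important characters" (the "Optional
- characters" of the tables), and any further lines are comments. That
- layout is inferred from the entries above. The names Language,
- parse_charset_data() and SAMPLE are hypothetical; the sample data is
- copied from the German and Polish entries.
-
```python
from dataclasses import dataclass, field


@dataclass
class Language:
    name: str
    required: list = field(default_factory=list)
    optional: list = field(default_factory=list)
    comments: list = field(default_factory=list)


def parse_charset_data(text):
    """Parse '&language' records into a list of Language objects."""
    languages = []
    current = None
    position = 0  # number of content lines seen in the current record
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # format description, not data
        if line.startswith("&language"):
            current = Language(name=line[len("&language"):].strip())
            languages.append(current)
            position = 0
            continue
        if not line or current is None:
            continue  # blank separator, or text before the first record
        position += 1
        if position == 1:
            current.required = line.split()
        elif position == 2:
            current.optional = line.split()
        else:
            current.comments.append(line)
    return languages


SAMPLE = """\
# A list of languages and their required/optional characters.
&language German
a: o: u: ss
e' a!
The "ss" character exists only in lower case; the upper case equivalent
is "SS" (2 letters).

&language Polish
o' a; e; c' n' s' z' l/ z.
"""

for language in parse_charset_data(SAMPLE):
    print(language.name, language.required, language.optional)
```
-
- A record with comments but no optional characters would be misread by
- this positional rule; no such record appears in the data above.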
-
-
- 6. REFERENCES
-
-
- [ISO 8859]
- Information technology - 8-bit single-byte coded graphic
- character sets
-
- [ISO 6937]
- Information processing - Coded graphic character set for text
- communication
-
- [ISO 639]
- Codes for identifying languages (1988 version)
-
- [ISO 10646]
- Information technology - Universal Multiple-Octet Coded
- Character Set
-
- [RFC-KELD]
- Keld Simonsen: Character Mnemonics & Character Sets, RFC
- 1345, June 1992
-
-
-
- Table of Contents
-
-
- Abstract ................................................... 1
- Status of this Memo ........................................ 1
- 1 Introduction .............................................. 2
- 2 Introduction to language tables ........................... 2
- 2.1 Table structure ......................................... 3
- 2.2 Sources utilized ........................................ 4
- 2.3 What accents mean ....................................... 4
- 3 Language tables ........................................... 5
- 3.1 lt Lithuanian ........................................... 5
- 3.2 lv Latvian .............................................. 6
- 3.3 et Estonian ............................................. 7
- 3.4 fi Finnish .............................................. 7
- 3.5 ?? Sami ................................................. 8
- 3.6 sv Swedish .............................................. 9
- 3.7 no Norwegian ............................................ 10
- 3.8 da Danish ............................................... 10
- 3.9 fo Faeroese ............................................. 11
- 3.10 is Icelandic ........................................... 12
- 3.11 kl Greenlandic ......................................... 12
- 3.12 ?? Gaelic .............................................. 13
- 3.13 ga Irish ............................................... 14
- 3.14 cy Welsh ............................................... 14
- 3.15 br Breton .............................................. 15
- 3.16 fy Frisian ............................................. 16
- 3.17 nl Dutch ............................................... 16
- 3.18 af Afrikaans ........................................... 17
- 3.19 de German .............................................. 18
- 3.20 fr French .............................................. 19
- 3.21 ca Catalan ............................................. 20
- 3.22 es Spanish ............................................. 20
- 3.23 gl Galician ............................................ 21
- 3.24 pt Portuguese .......................................... 22
- 3.25 eu Basque .............................................. 23
- 3.26 mt Maltese ............................................. 23
- 3.27 it Italian ............................................. 24
- 3.28 ?? Rhaetian ............................................ 25
- 3.29 ro Romanian ............................................ 25
- 3.30 hu Hungarian ........................................... 26
- 3.31 sq Albanian ............................................ 27
- 3.32 tr Turkish ............................................. 27
- 3.33 hr Croatian ............................................ 28
- 3.34 sl Slovenian ........................................... 28
- 3.35 sk Slovak .............................................. 29
- 3.36 cs Czech ............................................... 30
- 3.37 pl Polish .............................................. 30
- 3.38 ?? Sorbian ............................................. 31
- 3.39 eo Esperanto ........................................... 32
- 4 Other languages with appropriate character sets ........... 32
- 4.1 ISO 10646 only languages ................................ 33
- 5 Encoded format of charset data ............................ 35
- 6 REFERENCES ................................................ 39